Learning to Rank Answers to Non-Factoid Questions from Web Collections

نویسندگان

  • Mihai Surdeanu
  • Massimiliano Ciaramita
  • Hugo Zaragoza
چکیده

This work investigates the use of linguistically motivated features to improve search, in particular for ranking answers to non-factoid questions. We show that it is possible to exploit existing large collections of question–answer pairs (from online social Question Answering sites) to extract such features and train ranking models which combine them effectively. We investigate a wide range of feature types, some exploiting natural language processing such as coarse word sense disambiguation, named-entity identification, syntactic parsing, and semantic role labeling. Our experiments demonstrate that linguistic features, in combination, yield considerable improvements in accuracy. Depending on the system settings we measure relative improvements of 14% to 21% in Mean Reciprocal Rank and Precision@1, providing one of the most compelling evidence to date that complex linguistic features such as word senses and semantic roles can have a significant impact on large-scale information retrieval tasks.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Learning to Rank Answers on Large Online QA Collections

This work describes an answer ranking engine for non-factoid questions built using a large online community-generated question-answer collection (Yahoo! Answers). We show how such collections may be used to effectively set up large supervised learning experiments. Furthermore we investigate a wide range of feature types, some exploiting NLP processors, and demonstrate that using them in combina...

متن کامل

A Pattern Based Approach to Answering Factoid, List and Definition Questions

Finding textual answers to open-domain questions in large text collections is a difficult problem. In this paper we concentrate on three types of questions: factoid, list, and definition questions and present two systems that make use of regular expression patterns and other devices in order to locate possible answers. While the factoid and list system acquires patterns in an off-line phase usi...

متن کامل

Beyond Factoid QA: Effective Methods for Non-factoid Answer Sentence Retrieval

Retrieving finer grained text units such as passages or sentences as answers for non-factoid Web queries is becoming increasingly important for applications such as mobile Web search. In this work, we introduce the answer sentence retrieval task for non-factoid Web queries, and investigate how this task can be effectively solved under a learning to rank framework. We design two types of feature...

متن کامل

Topic indexing and retrieval for open domain factoid question answering

Factoid Question Answering is an exciting area of Natural Language Engineering that has the potential to replace one major use of search engines today. In this dissertation, I introduce a new method of handling factoid questions whose answers are proper names. The method, Topic Indexing and Retrieval, addresses two issues that prevent current factoid QA system from realising this potential: The...

متن کامل

Automatic Question Answering: Beyond the Factoid

In this paper we describe and evaluate a Question Answering system that goes beyond answering factoid questions. We focus on FAQlike questions and answers, and build our system around a noisy-channel architecture which exploits both a language model for answers and a transformation model for answer/question terms, trained on a corpus of 1 million question/answer pairs collected from the Web.

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Computational Linguistics

دوره 37  شماره 

صفحات  -

تاریخ انتشار 2011